{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src='http://www-scf.usc.edu/~ghasemig/images/sharif.png' alt=\"SUT logo\" width=200 height=200 align=left class=\"saturate\" >\n",
    "\n",
    "<br>\n",
    "<font face=\"Times New Roman\">\n",
    "<div dir=ltr align=center>\n",
    "<font color=0F5298 size=7>\n",
    "    ML for Bioinformatics <br>\n",
    "<font color=2565AE size=5>\n",
    "    Computer Engineering Department <br>\n",
    "<font color=3C99D size=5>\n",
    "    Homework 2: Practical - Support Vector Machines <br>\n",
    "<font color=696880 size=4>\n",
    "    Ali Shafiei (shafieiali42@gmail.com) <br>\n",
    "    Ali Salmani (alisalmani200149@gmail.com)\n",
    "\n",
    "____\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Full Name : \n",
    "### Student Number : \n",
    "__\n",
    "\n",
    "*It is highly recommended to read each codeline carefully and try to understand what it exactly does. Best of luck and have fun!*"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Support Vector Machines (SVM)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt\n",
    "import seaborn as sns\n",
    "from sklearn.svm import SVC"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In this assignment, we are going to implement Support Vector Machines (SVM) algorithm that determines which patient is in danger and which is not."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.read_csv(\"Liver_Disease.csv\") "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Pre-Processing"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Exploratory Data Analysis (EDA):\n",
    "In statistics, exploratory data analysis is an approach to analyze datasets to summarize their main characteristics, often using statistical graphics and other data visualization methods.\n",
    "\n",
    "This is a general approach that should be applied when you encounter a dataset."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "###############################################################################\n",
    "## TODO: Find the shape of the dataset.                                      ##\n",
    "###############################################################################\n",
    "\n",
    "\n",
    "###############################################################################\n",
    "## TODO: Check if there is missing entries in the dataset columnwise.        ##\n",
    "###############################################################################\n",
    "\n",
    "###############################################################################\n",
    "## TODO: Check whether the dataset is balanced or not.                       ##\n",
    "###############################################################################\n",
    "\n",
    "###############################################################################\n",
    "## TODO: plot the age distirbution and gender distrbution for both group    ##\n",
    "## of patients.(4 plots)                                                    ##\n",
    "###############################################################################\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Question: What do you conclude from the plots?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Outlier Detection & Removal\n",
    "Check whether we have outliers in the data. If there are, delete them."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "################################################################################\n",
    "## TODO\n",
    "################################################################################\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Feature Engineering:\n",
    "Sometimes the collected data are raw; they are either incompatible with your model or hinders its performance. That’s when feature engineering comes to rescue. It encompasses preprocessing techniques to compile a dataset by extracting features from raw data.\n",
    "also feel free to do more feature engineering techniques if needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "################################################################################\n",
    "## TODO: Normalize numerical features to be between 0 and 1                   ##\n",
    "## Note that just numerical fetures should be normalized.                     ##\n",
    "################################################################################"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### SVM"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### spliting data"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "## TODO Split the data into test and training sets.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### SVM using Scikit-Learn:\n",
    "First of all train an svm model with default parameters and report its."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#########################################################################################\n",
    "## TODO\n",
    "#########################################################################################\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Grid Search\n",
    "Use Grid search and validation set to find the best parameters for your SVM model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#########################################################################################\n",
    "## TODO\n",
    "#########################################################################################\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Train an svm model on the entire training data using the parameters you found in the previous step."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#########################################################################################\n",
    "## TODO\n",
    "#########################################################################################\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Confusion Matrix\n",
    "Plot the confusion matrix and report the model accuracy on test set.\n",
    "What does each entry of the confusion matrix mean?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#########################################################################################\n",
    "## TODO\n",
    "#########################################################################################\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Add some outliers to the dataset, train an SVM and logistic regression model, and compare the results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "#########################################################################################\n",
    "## TODO\n",
    "#########################################################################################\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
